HYPROSP: a hybrid protein secondary structure prediction algorithm--a knowledge-based approach.

Authors

  • Kuen-Pin Wu
  • Hsin-Nan Lin
  • Jia-Ming Chang
  • Ting-Yi Sung
  • Wen-Lian Hsu
Abstract

We develop a knowledge-based approach (called PROSP) for protein secondary structure prediction. The knowledge base contains small peptide fragments together with their secondary structural information. A quantitative measure M, called match rate, is defined to measure the amount of structural information that a target protein can extract from the knowledge base. Our experimental results show that proteins with a higher match rate are likely to be predicted more accurately by PROSP. That is, there is a roughly monotone correlation between the prediction accuracy and the amount of structure matching with the knowledge base. To fully utilize the strength of our knowledge base, a hybrid prediction method is proposed as follows: if the match rate of a target protein is at least 80%, we use the extracted information to make the prediction; otherwise, we adopt a popular machine-learning approach. This constitutes our hybrid protein structure prediction (HYPROSP) approach. We use the DSSP and EVA data as our datasets and PSIPRED as our underlying machine-learning algorithm. For target proteins with a match rate of at least 80%, the average Q3 of PROSP is 3.96 and 7.2 better than that of PSIPRED on DSSP and EVA data, respectively.
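
The hybrid rule in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the match rate is computed, so it is taken here as a given percentage, and the two predictors are hypothetical callables standing in for PROSP and PSIPRED. The `q3` helper assumes the usual definition of Q3 as the percentage of residues whose three-state label (H, E, C) is predicted correctly.

```python
from typing import Callable

# Hypothetical predictor type: amino-acid sequence -> string of H/E/C labels.
Predictor = Callable[[str], str]

# Threshold stated in the abstract (match rate of at least 80%).
MATCH_RATE_THRESHOLD = 80.0


def hybrid_predict(
    sequence: str,
    match_rate: float,
    knowledge_based: Predictor,
    machine_learning: Predictor,
    threshold: float = MATCH_RATE_THRESHOLD,
) -> str:
    """Use the knowledge-based predictor when the match rate reaches the
    threshold; otherwise fall back to the machine-learning predictor."""
    if match_rate >= threshold:
        return knowledge_based(sequence)
    return machine_learning(sequence)


def q3(predicted: str, observed: str) -> float:
    """Percentage of positions whose three-state label is predicted correctly."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("label strings must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)
```

With stub predictors, `hybrid_predict("ACDE", 85.0, prosp_stub, psipred_stub)` returns the output of `prosp_stub`, while a match rate of 79.9 routes the same sequence to `psipred_stub`.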

Similar resources

A knowledge-based hybrid method for protein secondary structure prediction based on local prediction confidence

Motivation: In our previous approach, we proposed a hybrid method for protein secondary structure prediction, called HYPROSP, which combined our proposed knowledge-based prediction algorithm PROSP and PSIPRED. The knowledge base constructed for PROSP contains small peptides together with their secondary structural information. The hybrid strategy of HYPROSP uses a global quantitative measure, m...

HYPROSP II-A knowledge-based hybrid method for protein secondary structure prediction based on local prediction confidence

MOTIVATION In our previous approach, we proposed a hybrid method for protein secondary structure prediction called HYPROSP, which combined our proposed knowledge-based prediction algorithm PROSP and PSIPRED. The knowledge base constructed for PROSP contains small peptides together with their secondary structural information. The hybrid strategy of HYPROSP uses a global quantitative measure, mat...

Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches

A DNA sequence, containing all genetic traits, is not a functional entity. Instead, it is converted into protein sequences by transcription and translation processes. This protein sequence later takes on a 3D structure, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...

A Knowledge-Based Approach to Protein Local Structure Prediction

Local structure prediction can facilitate ab initio structure prediction, protein threading, and remote homology detection. However, previous approaches to local structure prediction suffer from poor accuracy. In this paper, we propose a knowledge-based prediction method that assigns a measure called the local match rate to each position of an amino acid sequence to estimate the confidence of o...

ROBUST RESOURCE-CONSTRAINED PROJECT SCHEDULING WITH UNCERTAIN-BUT-BOUNDED ACTIVITY DURATIONS AND CASH FLOWS I. A NEW SAMPLING-BASED HYBRID PRIMARY-SECONDARY CRITERIA APPROACH

In this paper, we present a new primary-secondary-criteria scheduling model for the resource-constrained project scheduling problem (RCPSP) with uncertain activity durations (UD) and cash flows (UC). The RCPSP-UD-UC approach producing a “robust” resource-feasible schedule immunized against uncertainties in the activity durations and which is on the sampling-based scenarios may be evaluated from a cos...

Journal:
  • Nucleic acids research

Volume 32, Issue 17

Pages: -

Publication date: 2004